AITopics | gram matrix

Spectral Analysis of Representational Similarity with Limited Neurons

Neural Information Processing SystemsJun-15-2026, 15:39:32 GMT

Understanding representational similarity between neural recordings and computational models is essential for neuroscience, yet remains challenging to measure reliably due to the constraints on the number of neurons that can be recorded simultaneously. In this work, we apply tools from Random Matrix Theory to investigate how such limitations affect similarity measures, focusing on Centered Kernel Alignment (CKA) and Canonical Correlation Analysis (CCA). We propose an analytical framework for representational similarity analysis that relates measured similarities to the spectral properties of the underlying representations. We demonstrate that neural similarities are systematically underestimated under finite neuron sampling, mainly due to eigenvector delocalization. Moreover, for power-law population spectra, we show that the number of localized eigenvectors scales as the square root of the number of recorded neurons, providing a simple rule of thumb for practitioners. To overcome sampling bias, we introduce a denoising method to infer population-level similarity, enabling accurate analysis even with small neuron samples. Theoretical predictions are validated on synthetic and real datasets, offering practical strategies for interpreting neural data under finite sampling constraints.

artificial intelligence, bayesian inference, machine learning, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

A Martingale Kernel Independence Test

Laumann, Felix, Liu, Zhaolu, Barahona, Mauricio

arXiv.org Machine LearningMay-22-2026

The Hilbert-Schmidt Independence Criterion (HSIC) and its joint-independence extension $d\mathrm{HSIC}$ are degenerate $V$-statistics whose data-dependent weighted-$χ^2$ null limits force a permutation calibration that multiplies the per-test cost by the number of permutations, in practice two orders of magnitude. Adapting the recent martingale MMD construction for two-sample testing to the (joint) independence problem, we introduce two studentised statistics whose null distributions are standard normal regardless of the data law, so that a single normal-quantile lookup replaces the permutation step entirely. The first, $m\mathrm{HSIC}$, is a self-normalised lower-triangular sum of the Hadamard product of two empirically centred Gram matrices. Under independence and bounded-fourth-moment kernels it converges to a standard normal. It is consistent against every fixed alternative, and runs at quadratic cost in the sample size without any sample split, matching the biased HSIC $V$-statistic. Our second statistic, $md\mathrm{HSIC}$, achieves finite-sample consistency with a single half-sample split: the centring is estimated on one half and the lower-triangular self-normalised martingale is run on the other, shrinking the conditional-mean residual to a quantity that is exponentially small in $d$, so the statistic is asymptotically standard normal at every fixed number of jointly tested variables, with a per-test cost that grows only linearly in $d$. On synthetic data with per-variable input dimension from $1$ to $500$ and between $2$ and $10$ jointly tested variables, both statistics match the empirical type-I error rate and test power of permutation-calibrated baselines while running $25$ to $60\times$ faster.

artificial intelligence, hsic, machine learning, (17 more...)

arXiv.org Machine Learning

2605.22549

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Scalable Gaussian process inference via neural feature maps

Stephenson, Anthony

arXiv.org Machine LearningMay-12-2026

We present a theoretically grounded Gaussian process framework that leverages neural feature maps to construct expressive kernels. We show that the learned feature map can be interpreted as an optimal low-rank approximation to a Gram matrix derived from an implied RKHS, from which we establish consistency of the GP posterior. We further analyse the spectral properties of the induced kernels and introduce product feature-map kernels to address oversmoothing. This simple yet powerful approach enables fast, scalable, and accurate exact GP inference with minimal upfront work. The flexibility of kernel design supports seamless application to both regression and classification tasks across diverse data modalities, including tabular inputs and structured domains such as images.

artificial intelligence, machine learning, modeling & simulation, (19 more...)

arXiv.org Machine Learning

2605.10285

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Modeling & Simulation (0.86)
(2 more...)

Add feedback

c4e380fb74dec9da9c7212e834657aa9-Paper-Conference.pdf

Neural Information Processing SystemsApr-29-2026, 17:03:35 GMT

artificial intelligence, communication cost, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Texas (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

4b5deb9a14d66ab0acc3b8a2360cde7c-Supplemental.pdf

Neural Information Processing SystemsApr-25-2026, 18:53:31 GMT

What can linearized neural networks actually say about generalization? As mentioned in the main text, all our models are trained using the same scheme which was selected without any hyperparameter tuning, besides ensuring a good performance on CIFAR2 for the neural networks. Namely, we train using stochastic gradient descent (SGD) to optimize a binary crossentropy loss, with a decaying learning rate starting at 0.05 and momentum set to 0.9. Furthermore, we use a batch size of 128and train for a 100epochs. This is enough to obtain close-to-zero training losses for the neural networks, and converge to a stable test accuracy in the case of the linearized models1.

artificial intelligence, eigenfunction, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Add feedback

Koopman Subspace Pruning in Reproducing Kernel Hilbert Spaces via Principal Vectors

Shah, Dhruv, Cortes, Jorge

arXiv.org Machine LearningApr-3-2026

Data-driven approximations of the infinite-dimensional Koopman operator rely on finite-dimensional projections, where the predictive accuracy of the resulting models hinges heavily on the invariance of the chosen subspace. Subspace pruning systematically discards geometrically misaligned directions to enhance this invariance proximity, which formally corresponds to the largest principal angle between the subspace and its image under the operator. Yet, existing techniques are largely restricted to Euclidean settings. To bridge this gap, this paper presents an approach for computing principal angles and vectors to enable Koopman subspace pruning within a Reproducing Kernel Hilbert Space (RKHS) geometry. We first outline an exact computational routine, which is subsequently scaled for large datasets using randomized Nystrom approximations. Based on these foundations, we introduce the Kernel-SPV and Approximate Kernel-SPV algorithms for targeted subspace refinement via principal vectors. Simulation results validate our approach.

approximation, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

2604.01459

Country:

North America > United States > New York (0.04)
North America > United States > District of Columbia > Washington (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Neuc-MDS: Non-Euclidean Multidimensional Scaling Through Bilinear Forms

Neural Information Processing SystemsMar-22-2026, 16:38:59 GMT

We introduce \textbf{N}on-\textbf{Euc}lidean-\textbf{MDS} (Neuc-MDS), which extends Multidimensional Scaling (MDS) to generate outputs that can be non-Euclidean and non-metric. The main idea is to generalize the inner product to other symmetric bilinear forms to utilize the negative eigenvalues of dissimiliarity Gram matrices. Neuc-MDS efficiently optimizes the choice of (both positive and negative) eigenvalues of the dissimilarity Gram matrix to reduce STRESS, the sum of squared pairwise error. We provide an in-depth error analysis and proofs of the optimality in minimizing lower bounds of STRESS. We demonstrate Neuc-MDS's ability to address limitations of classical MDS raised by prior research, and test it on various synthetic and real-world datasets in comparison with both linear and non-linear dimension reduction methods.

artificial intelligence, machine learning, proceedings, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.79)

Add feedback